18 research outputs found

Deriving Verb Predicates By Clustering Verbs with Arguments

Author: Rouhizadeh Masoud
Schwartz Andy
Sedoc Joao
Ungar Lyle
Wijaya Derry
Publication date: 01/01/2017

Hand-built verb clusters such as the widely used Levin classes (Levin, 1993) have proved useful, but have limited coverage. Verb classes automatically induced from corpus data such as those from VerbKB (Wijaya, 2016), on the other hand, can give clusters with much larger coverage, and can be adapted to specific corpora such as Twitter. We present a method for clustering the outputs of VerbKB: verbs with their multiple argument types, e.g. "marry(person, person)", "feel(person, emotion)." We make use of a novel low-dimensional embedding of verbs and their arguments to produce high quality clusters in which the same verb can be in different clusters depending on its argument type. The resulting verb clusters do a better job than hand-built clusters of predicting sarcasm, sentiment, and locus of control in tweets.

arXiv.org e-Print Archive

Boston University Institutional Repository (OpenBU)

Collecting Semantic Data by Mechanical Turk for the Lexical Knowledge Resource of a Text-to-Picture Generating System

Author: Bowler Margit
Coyne Robert Eric
Rouhizadeh Masoud
Sproat Richard
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2010

WordsEye is a system for automatically converting natural language text into 3D scenes representing the meaning of that text. At the core of WordsEye is the Scenario-Based Lexical Knowledge Resource (SBLR), a unified knowledge base and representational system for expressing lexical and real-world knowledge needed to depict scenes from text. To enrich a portion of the SBLR, we need to fill out some contextual information about its objects, including information about their typical parts, typical locations and typical objects located near them. This paper explores our proposed methodology to achieve this goal. First, we try to collect some semantic information by using Amazon’s Mechanical Turk (AMT). Then, we manually filter and classify the collected data and finally, we compare the manual results with the output of some automatic filtration techniques which use several WordNet similarity and corpus association measures.

Columbia University Academic Commons

Data Collection and Normalization for Building the Scenario-Based Lexical Knowledge Resource of a Text-to-Scene Conversion System

Author: Bowler Margit
Coyne Robert Eric
Rouhizadeh Masoud
Sproat Richard
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2010

WordsEye is a system for converting from English text into three-dimensional graphical scenes that represent that text. It works by performing syntactic and semantic analyses on the input text, producing a description of the arrangement of objects in a scene. At the core of WordsEye is the Scenario-Based Lexical Knowledge Resource (SBLR), a unified knowledge base and representational system for expressing lexical and real-world knowledge needed to depict scenes from text. This paper explores information collection methods for building the SBLR, using Amazon’s Mechanical Turk (AMT) and manual normalization of raw AMT data. The paper follows with manual review of existing relations in the SBLR and classification of the AMT data into existing and new semantic relations. Since manual annotation is a time-consuming and expensive approach, we also explored the use of automatic normalization of AMT data through log-odds and log-likelihood ratios extracted from the English Gigaword corpus, as well as through WordNet similarity measures.

Columbia University Academic Commons

Collecting Spatial Information for Locations in a Text-to-Scene Conversion System

Author: Bauer Daniel
Coyne Robert Eric
Rambow Owen C.
Rouhizadeh Masoud
Sproat Richard
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2011

We investigate using Amazon Mechanical Turk (AMT) for building a low-level description corpus and populating VigNet, a comprehensive semantic resource that we will use in a text-to-scene generation system. To depict a picture of a location, VigNet should contain the knowledge about the typical objects in that location and the arrangements of those objects. Such information is mostly common-sense knowledge that is taken for granted by human beings and is not stated in existing lexical resources and in text corpora. In this paper we focus on collecting objects of locations using AMT. Our results show that it is a promising approach.

Columbia University Academic Commons

Annotation Tools and Knowledge Representation for a Text-To-Scene System

Author: Bauer Daniel
Coyne Robert Eric
Klapheke Alexander
Rouhizadeh Masoud
Sproat Richard
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2012

Text-to-scene conversion requires knowledge about how actions and locations are expressed in language and realized in the world. To provide this knowledge, we are creating a lexical resource (VigNet) that extends FrameNet by creating a set of intermediate frames (vignettes) that bridge between the high-level semantics of FrameNet frames and a new set of low-level primitive graphical frames. Vignettes can be thought of as a link between function and form – between what a scene means and what it looks like. In this paper, we describe the set of primitive graphical frames and the functional properties of 3D objects (affordances) we use in this decomposition. We examine the methods and tools we have developed to populate VigNet with a large number of action and location vignettes.

Columbia University Academic Commons

Data collection and normalization for building the scenario-based lexical knowledge resource of a text-to-scene conversion system

Author: Bob Coyne
Margit Bowler
Masoud Rouhizadeh
Richard Sproat
Publication date: 01/01/2010

WordsEye is a system for converting from English text into three-dimensional graphical scenes that represent that text. It works by performing syntactic and semantic analyses on the input text, producing a description of the arrangement of objects in a scene. At the core of WordsEye is the Scenario-Based Lexical Knowledge Resource (SBLR), a unified knowledge base and representational system for expressing lexical and real-world knowledge needed to depict scenes from text. This paper explores information collection methods for building the SBLR, using Amazon’s Mechanical Turk (AMT) and manual normalization of raw AMT data. The paper follows with manual review of existing relations in the SBLR and classification of the AMT data into existing and new semantic relations. Since manual annotation is a time-consuming and expensive approach, we also explored the use of automatic normalization of AMT data through log-odds and log-likelihood ratios extracted from the English Gigaword corpus, as well as through WordNet similarity measures.

CiteSeerX

Crossref